Multi-modal Blind Source Separation with Microphones and Blinkies
We propose a blind source separation algorithm that jointly exploits
measurements by a conventional microphone array and an ad hoc array of low-rate
sound power sensors called blinkies. While providing less information than
microphones, blinkies circumvent some difficulties of microphone arrays in
terms of manufacturing, synchronization, and deployment. The algorithm is
derived from a joint probabilistic model of the microphone and sound power
measurements. We assume the separated sources to follow a time-varying
spherical Gaussian distribution, and the non-negative power measurement
space-time matrix to have a low-rank structure. We show that alternating
updates similar to those of independent vector analysis and Itakura-Saito
non-negative matrix factorization decrease the negative log-likelihood of the
joint distribution. The proposed algorithm is validated via numerical
experiments. Its median separation performance is found to be up to 8 dB
higher than that of independent vector analysis, with significantly reduced
variability.
Comment: Accepted at IEEE ICASSP 2019, Brighton, UK. 5 pages. 3 figure
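The abstract names Itakura-Saito non-negative matrix factorization (IS-NMF) as one of the two building blocks of the alternating updates. The sketch below illustrates only that building block, not the joint microphone and blinky algorithm of the paper. It assumes a strictly positive power matrix `v` and uses the majorization-minimization form of the multiplicative updates (square-root exponent), which is known not to increase the Itakura-Saito divergence. The function names, the random initialization, and the default iteration count are illustrative assumptions.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def is_divergence(v, v_hat):
    """Itakura-Saito divergence summed over all entries (all entries > 0)."""
    total = 0.0
    for row, row_hat in zip(v, v_hat):
        for x, y in zip(row, row_hat):
            total += x / y - math.log(x / y) - 1.0
    return total


def update_terms(v, v_hat):
    """Element-wise v / v_hat^2 and 1 / v_hat, used by both factor updates."""
    ratio = [[x / y ** 2 for x, y in zip(r, rh)] for r, rh in zip(v, v_hat)]
    inv = [[1.0 / y for y in rh] for rh in v_hat]
    return ratio, inv


def scaled(factor, numer, denom):
    """Multiplicative update with the square-root (MM) exponent."""
    return [[f * math.sqrt(n / d) for f, n, d in zip(fr, nr, dr)]
            for fr, nr, dr in zip(factor, numer, denom)]


def is_nmf(v, rank, n_iter=30, seed=0):
    """Factor v (F x N) as w (F x rank) times h (rank x N).

    Returns w, h and the divergence before and after every iteration.
    """
    rng = random.Random(seed)
    w = [[rng.uniform(0.5, 1.5) for _ in range(rank)] for _ in v]
    h = [[rng.uniform(0.5, 1.5) for _ in v[0]] for _ in range(rank)]
    history = [is_divergence(v, matmul(w, h))]
    for _ in range(n_iter):
        # Update h with w fixed.
        ratio, inv = update_terms(v, matmul(w, h))
        wt = transpose(w)
        h = scaled(h, matmul(wt, ratio), matmul(wt, inv))
        # Update w with the new h fixed.
        ratio, inv = update_terms(v, matmul(w, h))
        ht = transpose(h)
        w = scaled(w, matmul(ratio, ht), matmul(inv, ht))
        history.append(is_divergence(v, matmul(w, h)))
    return w, h, history
```

Inspecting `history` after a run on a small positive matrix is a quick way to check that the divergence does not increase from one iteration to the next. For real data, an array library would be the more natural choice than nested lists.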
Raking the Cocktail Party
We present the concept of an acoustic rake receiver, a microphone beamformer
that uses echoes to improve noise and interference suppression. The rake
idea is well-known in wireless communications; it involves constructively
combining different multipath components that arrive at the receiver antennas.
Unlike spread-spectrum signals used in wireless communications, speech signals
are not orthogonal to their shifts. Therefore, we focus on the spatial
structure, rather than temporal. Instead of explicitly estimating the channel,
we create correspondences between early echoes in time and image sources in
space. These multiple sources of the desired and the interfering signal offer
additional spatial diversity that we can exploit in the beamformer design.
We present several "intuitive" and optimal formulations of acoustic rake
receivers, and show theoretically and numerically that the rake formulation of
the maximum signal-to-interference-and-noise beamformer offers significant
performance boosts in terms of noise and interference suppression. Beyond
signal-to-noise ratio, we observe gains in the perceptual evaluation of
speech quality (PESQ) metric. We accompany the paper with the complete
simulation and processing chain written in Python. The code and the sound
samples are available online at
http://lcav.github.io/AcousticRakeReceiver/.
Comment: 12 pages, 11 figures, Accepted for publication in IEEE Journal on
Selected Topics in Signal Processing (Special Issue on Spatial Audio
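The correspondence between early echoes in time and image sources in space can be illustrated with a minimal sketch. It assumes a 2D rectangular room with one corner at the origin, first-order reflections only, and a speed of sound of 343 m/s. The function names are hypothetical and are not taken from the paper's code.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def first_order_images(source, room_size):
    """Mirror a source across the four walls of a rectangular room.

    The room spans [0, lx] x [0, ly]. Each wall produces one image source.
    """
    x, y = source
    lx, ly = room_size
    return [
        (-x, y),          # wall at x = 0
        (2 * lx - x, y),  # wall at x = lx
        (x, -y),          # wall at y = 0
        (x, 2 * ly - y),  # wall at y = ly
    ]


def echo_delays(source, mic, room_size):
    """Propagation delays in seconds: direct path first, then the four echoes."""
    paths = [source] + first_order_images(source, room_size)
    return [math.dist(p, mic) / SPEED_OF_SOUND for p in paths]
```

Each entry of `echo_delays` after the first is the arrival time of one early echo, and the matching entry of `first_order_images` is the point in space it appears to come from. A rake-style beamformer would steer towards these image positions in addition to the true source.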
Pyroomacoustics: A Python package for audio room simulations and array processing algorithms
We present pyroomacoustics, a software package aimed at the rapid development
and testing of audio array processing algorithms. The content of the package
can be divided into three main components: an intuitive Python object-oriented
interface to quickly construct different simulation scenarios involving
multiple sound sources and microphones in 2D and 3D rooms; a fast C
implementation of the image source model for general polyhedral rooms to
efficiently generate room impulse responses and simulate the propagation
between sources and receivers; and finally, reference implementations of
popular algorithms for beamforming, direction finding, and adaptive filtering.
Together, they form a package with the potential to shorten the time to market
of new algorithms by significantly reducing the implementation overhead in the
performance evaluation step.
Comment: 5 pages, 5 figures, describes a software packag
Pruned Continuous Haar Transform of 2D Polygonal Patterns with Application to VLSI Layouts
We introduce an algorithm for the efficient computation of the continuous
Haar transform of 2D patterns that can be described by polygons. These patterns
are ubiquitous in VLSI processes where they are used to describe design and
mask layouts. There, speed is of paramount importance due to the magnitude of
the problems to be solved and hence very fast algorithms are needed. We show
that by techniques borrowed from computational geometry we are not only able to
compute the continuous Haar transform directly, but also to do it quickly. This
is achieved by massively pruning the transform tree and thus dramatically
decreasing the computational load when the number of vertices is small, as is
the case for VLSI layouts. We call this new algorithm the pruned continuous
Haar transform. We implement this algorithm and show that, for patterns found
in VLSI layouts, it is in the worst case as fast as its discrete counterpart
and up to 12 times faster.
Comment: 4 pages, 5 figures, 1 algorith
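The pruning idea can be illustrated with a simplified 1D analogue. This is not the 2D polygon algorithm of the paper. The sketch assumes a pattern given as disjoint intervals inside [0, 1), computes Haar wavelet coefficients exactly from interval overlaps, and skips every subtree whose dyadic support contains no interval endpoint, since the pattern is constant there and all coefficients below are zero. The function names are hypothetical.

```python
def overlap(intervals, lo, hi):
    """Total length of the pattern inside [lo, hi), for disjoint intervals."""
    return sum(max(0.0, min(b, hi) - max(a, lo)) for a, b in intervals)


def pruned_haar_1d(intervals, max_level):
    """Nonzero continuous Haar wavelet coefficients of an interval pattern.

    Returns a dict mapping (level j, shift k) to the coefficient of the
    wavelet supported on [k / 2^j, (k + 1) / 2^j).
    """
    edges = [e for interval in intervals for e in interval]
    coeffs = {}

    def visit(j, k):
        lo, hi = k / 2 ** j, (k + 1) / 2 ** j
        # Prune: no endpoint strictly inside means the pattern is constant
        # on this support, so this coefficient and all descendants vanish.
        if not any(lo < e < hi for e in edges):
            return
        mid = (lo + hi) / 2
        left = overlap(intervals, lo, mid)
        right = overlap(intervals, mid, hi)
        c = 2 ** (j / 2) * (left - right)
        if c != 0.0:
            coeffs[(j, k)] = c
        if j < max_level:
            visit(j + 1, 2 * k)
            visit(j + 1, 2 * k + 1)

    visit(0, 0)
    return coeffs
```

The work done depends on the number of endpoints rather than on a sampling grid, which mirrors the abstract's observation that pruning pays off when the number of vertices is small.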